DETAILED ACTION
Response to Arguments 

1. Applicant's arguments, see p. 8, filed 3/12/08, with respect to claims 30-33 have
been fully considered and are persuasive. The rejection under 35 USC 102(e) of claims 
30-33 has been withdrawn. 

2. Regarding claim 30, see the new rejection under 35 USC 103(a). 
Regarding Gjerdingen, it is the examiner's belief that human processing and
digital signal processing (DSP) methods are taught and it is obvious to combine them in
one embodiment (see column 6, lines 38-53, wherein "Data describing music attributes 
may also be collected by [DSP] and stored as DSP data 403B...."). Furthermore, it is 
believed that the spectral properties class as classified by human classification is taught 
by Gjerdingen. Compare the applicant's specification, p. 4, line 30 - p. 5, line 5 and p. 
14, line 23 - p. 15, line 3, to Gjerdingen's teachings in figures 7A1 and 7A2. They 
appear to be equivalent. Likewise Gjerdingen's teachings, from column 14, line 36 to 
column 15, line 3, appear to teach spectral properties characteristics as determined by 
digital signal processing. 

3. Regarding claims 31 and 32, see the preceding argument with respect to claim 
30. The rejection can be found in the following under 35 USC 103. 

4. Applicant's arguments with respect to claims 1-13, 15, 17, 21-26, 34, and 35 
have been considered but are moot in view of the new ground(s) of rejection. 

5. Regarding claim 1, see the new rejection with respect to the amended features. 
Also, the examiner respectfully disagrees with the applicant's arguments. Blum teaches
comparison of a "spectral feature vector" to the classification chain (see column 23,
line 56 - column 24, line 7, specifically the statement "This vector is compared to the
vectors for each class derived in the training process....", which leads the examiner to
believe that Blum teaches these features).

6. Regarding claims 2-13, 15, 17, 21-26, 34, and 35, see the preceding argument 
with respect to claims 30 and 1. The new rejections can be found in the following under
35 USC 103. 



Claim Rejections - 35 USC § 103 

7. The text of those sections of Title 35, U.S. Code not included in this action can 
be found in a prior Office action. 

8. Claims 1-13, 15 and 17 are rejected under 35 U.S.C. 103(a) as being 
unpatentable over the combination of Blum in view of Kjaer and Gjerdingen (all 
previously cited). 

9. Regarding claim 1, see Blum 

A method for automatically classifying spectral properties of audio data, comprising: 

applying input audio data (1) to a critical band filtering process to form first output data and (2) to
an entropy calculation process to form second output data; (column 6, lines 24-28)

applying the first output data to a first derivative process to form third output data; (column 6,
lines 28-30) and

inputting said first, second and third output data to an averaging process to form a spectral 
feature vector representing the input audio data (column 6, lines 32-35 and lines 45-48); and 

comparing the spectral feature vector to a classification chain containing pre-classified entries to 
determine at least one classification of the audio data (column 21, line 55 - column 22, line 20
and column 23, line 56 - column 24, line 7) wherein the classification chain data comprises a 
plurality of classification vectors, wherein each vector includes data representative of a spectral properties 
class as classified by humans and spectral properties characteristics as determined by digital signal 
processing. 

Blum teaches a method for automatically classifying spectral properties of audio data,
wherein a feature vector is created with the above features. The critical band filtering
process, as taught by Blum, is a Mel-frequency cepstral coefficient process. Blum does
not teach the entropy calculation for use in a feature vector; however, Blum has
described a feature vector with a plurality of metrics. Kjaer teaches an entropy
calculation, wherein a musical tone is classified by notes and accidentals (see Abstract 
and column 4, line 55 - column 7, line 34). Kjaer teaches that entropy is useful in 
classifying information composed of random processes, or processes that can be better 
understood using probability theory. It would have been obvious for one of ordinary skill 
in the art at the time of the invention to combine the teachings of Blum and Kjaer for the 
purpose of better classification. However, the combination of Blum and Kjaer does not 
appear to teach the combination of classification by humans and classification by DSP. 

Gjerdingen teaches a classification system that uses listeners' responses to
classify audio performances (column 3, lines 23-61, column 12, line 23 - column 14, line
35, figure 4, units 401, 403, and 404, and figures 7A1-7A2). Gjerdingen also teaches
digital signal processing to classify audio performances (column 9, lines 28-39, column
14, line 36 - column 15, line 6, and figure 4, unit 403B). Gjerdingen appears to teach
each classification system separately, wherein it is stated that DSP techniques "may be
used"; however, it would have been obvious to combine these methods to create a
system that performs both human classification and DSP classification for better results in
classification. A Press Release, titled "We Know What You Like: Can sites like
MongoMusic and LAUNCHcast really tell your musical tastes?" published in The
Industry Standard on 3/7/2000
(http://web.archive.org/web/20000824112802/www.mongomusic.com/s/press_standard_030700),
available from MongoMusic.com as archived by the Wayback Machine
(http://www.archive.org/index.php), provides evidence that it was obvious at the time of
the invention to create a "semi-automated, semi-human-based system." (see p. 2 of the
Press Release). It would have been obvious for one of ordinary skill in the art at the
time of the invention to combine the teachings of Blum, Kjaer, and Gjerdingen for the
purpose of better classification of audio performances.
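
The feature-vector steps recited in claim 1 can be pictured with a minimal sketch. It is an illustration only and is not taken from Blum, Kjaer, or Gjerdingen. It assumes the critical band outputs and the per-frame entropy values have already been computed, approximates the first derivative process as a frame-to-frame difference, and uses a plain arithmetic mean for the averaging process. The function names `mean_columns` and `spectral_feature_vector` are hypothetical.

```python
def mean_columns(rows):
    """Average a list of equal-length rows column by column."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def spectral_feature_vector(band_frames, entropy_frames):
    """Combine per-frame data into one feature vector.

    band_frames:    per-frame critical band outputs (first output data),
                    assumed to be precomputed elsewhere.
    entropy_frames: per-frame entropy values (second output data),
                    assumed to be precomputed elsewhere.
    """
    if not band_frames or len(band_frames) != len(entropy_frames):
        raise ValueError("need one entropy value per non-empty frame list")

    # Third output data: first derivative of the band outputs,
    # approximated here (an assumption) as a frame-to-frame difference.
    deltas = [
        [cur - prev for prev, cur in zip(prev_frame, cur_frame)]
        for prev_frame, cur_frame in zip(band_frames, band_frames[1:])
    ]
    if not deltas:
        # A single frame has no neighbor, so its derivative is taken as zero.
        deltas = [[0.0] * len(band_frames[0])]

    # Averaging process: mean of each output over all frames.
    avg_bands = mean_columns(band_frames)
    avg_entropy = sum(entropy_frames) / len(entropy_frames)
    avg_deltas = mean_columns(deltas)
    return avg_bands + [avg_entropy] + avg_deltas
```

With two frames of two bands each, `spectral_feature_vector([[1.0, 2.0], [3.0, 6.0]], [0.5, 1.5])` yields a five-element row vector: two averaged band values, one averaged entropy, and two averaged differences.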

10. Regarding claim 2, the further limitation of claim 1, see Blum

... wherein the audio data is divided into frames, and the method is performed frame by frame, (column 
6, lines 56-58) 

In the combination, Blum teaches the division of audio data into frames, wherein the 
method is performed frame by frame. 

11. Regarding claim 3, the further limitation of claim 1, see Blum

... further including calculating root mean squared values of the input audio data, (column 8, lines 1-3)

In the combination, Blum teaches RMS values. 
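
For reference, a root mean squared value of a block of samples can be computed as in the following minimal sketch. It shows the standard textbook definition only and is not a reproduction of the computation at the cited passage of Blum. The function name `rms` is hypothetical.

```python
import math


def rms(samples):
    """Root mean squared value: square root of the mean of the squares."""
    if not samples:
        raise ValueError("samples must be non-empty")
    return math.sqrt(sum(x * x for x in samples) / len(samples))
```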

12. Regarding claim 4, the further limitation of claim 2, see Kjaer 

. . . wherein said entropy calculation process includes calculating:
S = -\sum_{w} p_{w} \log_{2}(p_{w})

where S is the entropy of the frame, p_{w} is the normalized magnitude of a bin w of the audio data, and
\log_{2}(p_{w}) is the log base 2 of p_{w}. (column 5, lines 5-12 and equation H(x))

Kjaer teaches this entropy measure. 
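
The entropy measure recited in claim 4 can be sketched as follows. The sketch is illustrative and is not taken from Kjaer. It assumes that "normalized magnitude" means each bin magnitude divided by the sum of all bin magnitudes in the frame, which the claim language does not specify, and it treats a bin with zero magnitude as contributing nothing to the sum. The function name `spectral_entropy` is hypothetical.

```python
import math


def spectral_entropy(magnitudes):
    """Entropy S = -sum_w p_w * log2(p_w) of one frame.

    magnitudes: non-negative bin magnitudes for the frame.
    Normalization by the sum of magnitudes is an assumption.
    """
    total = sum(magnitudes)
    if total <= 0:
        return 0.0  # a silent frame is given zero entropy by convention
    entropy = 0.0
    for m in magnitudes:
        p = m / total
        if p > 0:  # skip empty bins, since p * log2(p) tends to 0 as p -> 0
            entropy -= p * math.log2(p)
    return entropy
```

A frame with four equal bins gives an entropy of 2 bits, and a frame whose energy sits in a single bin gives an entropy of 0.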

13. Regarding claim 5, the further limitation of claim 2, see the preceding argument
with respect to claim 3. Blum teaches the square root of the sum of squares, where the
square root is a mapping function and adjusts the scale of the function.

14. Regarding claim 6, the further limitation of claim 2, see the preceding argument
with respect to claim 1. The combination teaches this feature.

15. Regarding claim 7, the further limitation of claim 1, see the preceding argument
with respect to claim 1. The combination teaches a frequency domain transform.

16. Regarding claim 8, the further limitation of claim 7, see Blum 

... wherein said converting of the input audio data signal from the time domain to the frequency domain 
includes performing a fast Fourier transform on the audio data, (column 7, lines 56-61)

In the combination, Blum teaches an FFT. 

17. Regarding claim 9, the further limitation of claim 2, see the preceding argument
with respect to claim 1. The combination teaches dividing the input signal into frames
and averaging the features over all the frames.

18. Regarding claim 10, the further limitation of claim 1, see the preceding argument
with respect to claim 1. The combination teaches a classification process using the
feature vector, and this classification process determines a property class that describes 
the audio data (column 6, lines 7-10). 

19. Regarding claim 11, the further limitation of claim 1, see the preceding argument
with respect to claim 1. In the combination, Blum teaches a feature vector, and Blum
teaches that a vector is a row vector and not an NxM array (column 5, lines 52-55).
Blum teaches a 1xN array, wherein it is inherent that N can be 25.

20. Regarding claim 12, the further limitation of claim 1, see Blum

... wherein the audio data is formatted according to pulse code modulated format, (column 5, lines
24-50 and lines 64-66)

In the combination, Blum teaches a plurality of input devices in the system, wherein it is 
well known that optical disks containing audio data are encoded in a PCM format. 
Inherently Blum teaches this feature. 

21. Regarding claim 13, the further limitation of claim 12, see the preceding
argument with respect to claim 12. In the combination, Blum teaches the use of a
microphone and further teaches that a sound produced into the microphone can be 
searched (column 3, lines 52-55). It is inherent that the digitization step converts the 
analog waveform to a PCM format. 

22. Regarding claim 15, the further limitation of claim 12, see the preceding 
argument with respect to claim 8. The combination teaches an FFT operation, which is 
performed on the audio data. 

23. Regarding claim 17, the further limitation of claim 1, see Gjerdingen

. . . further comprising performing a principal component analysis process on the spectral feature vector. 
(column 15, lines 37-44) 

Blum teaches a refining process on the feature vector, but does not teach principal 
component analysis (PCA). Gjerdingen teaches that PCA is used to reduce the 
complexity of the data being analyzed. 

24. Claims 21-26 are rejected under 35 U.S.C. 103(a) as being unpatentable over 
the combination of Blum and Gjerdingen. 

25. Regarding claim 21, see Blum 

A method of classifying data according to spectral properties of the data, comprising:

assigning at least one spectral properties class to each media entity of a plurality of media entities
in a data set wherein said assigning is not based on digital signal processing; (column 21, lines 55-58,
line 64 - column 22, line 3 and column 22, lines 31-33)

processing each media entity of said data set to extract at least one spectral properties 
characteristic based on digital signal processing of each media entity; (column 22, lines 45-48) 

generating a plurality of spectral properties vectors for said plurality of media entities, wherein 
each spectral properties vector includes said at least one spectral properties class and at least one 
spectral properties characteristic based on digital signal processing; and (column 22, lines 48-50) 

forming a classification chain based upon said plurality of spectral properties vectors and the at 
least one spectral properties class (column 22, lines 55-65) ; and 

comparing unclassified data to the classification chain to estimate a classification of the 
unclassified data (column 21, line 55 - column 22, line 20 and column 22, line 56 - column
23, line 7) wherein the classification chain data comprises a plurality of classification vectors, wherein
each vector includes data representative of a spectral properties class as classified by humans and 
spectral properties characteristics as determined by digital signal processing. 

Blum teaches an equivalent method of classifying data according to its spectral
properties and class with these features. However, Blum teaches a disjointed
approach, wherein the assigning of the spectral properties class that is not based on digital
signal processing and the extraction of the spectral properties characteristic based on
digital signal processing are not taught to be performed together in classifying (i.e. Blum
teaches the use of DSP when the non-DSP classification method fails and does not
positively say they are used together to classify signals).

Gjerdingen teaches the use of DSP and non-DSP classification methods together
to model, or classify, the signals (column 6, lines 38-64 and figure 4, units 401, 403A,
403B, and 404-406). It would have been obvious for one of ordinary skill in the art at
the time of the invention to combine the teachings of Blum and Gjerdingen for the
purpose of placing music under many searchable elements (i.e. searching by artist,
mood, genre, sub-genre, etc.) (Gjerdingen, column 3, lines 23-67, column 8, lines 34-40,
and lines 50-57).

26. Regarding claim 22, the further limitation of claim 21, see Blum

. . . further comprising:

processing an unclassified media entity to extract at least one spectral properties characteristic
based on digital signal processing of the unclassified media entity; (column 21, lines 55-58)

generating a vector for the unclassified media entity including said at least one digital signal 
processing spectral properties characteristic; (column 21, lines 58-60) 

presenting the vector for the unclassified media entity to the classification chain; and 

classifying the unclassified entry with an estimate of the spectral properties class by calculating 
the representative spectral properties class of the subset of the plurality of vectors of the classification 
chain located in the neighborhood of the vector for the unclassified entity, (column 21, line 66 -
column 22, line 3)

Blum teaches these features in a method of classifying data. 

27. Regarding claim 23, the further limitation of claim 22, see Blum 

. . . further including calculating a neighborhood distance that defines a distance within which two vectors 
in the classification chain space are in the same neighborhood for purposes of being in the same spectral 
properties class, (column 22, lines 3-20) 

Blum teaches a calculation of a neighborhood distance. 
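
The neighborhood-based estimate recited in claims 22 and 23 can be pictured with the following minimal sketch. It is illustrative only and does not reproduce the distance measure or decision rule of Blum. It assumes a Euclidean distance, takes the "representative" class to be the most common class among the chain vectors inside the neighborhood distance, and reports the fraction of neighbors sharing that class as a simple agreement score. The function name `classify` and the class labels in the usage note are hypothetical.

```python
import math
from collections import Counter


def classify(chain, query, neighborhood_distance):
    """Estimate the class of an unclassified vector.

    chain: list of (class_label, feature_vector) pairs, i.e. the
           pre-classified entries of the classification chain.
    query: feature vector of the unclassified entity.
    Returns (label, agreement), or (None, 0.0) if the neighborhood is empty.
    """
    # Keep only the chain entries within the neighborhood distance
    # (Euclidean distance is an assumption made for this sketch).
    neighbors = [
        label
        for label, vector in chain
        if math.dist(vector, query) <= neighborhood_distance
    ]
    if not neighbors:
        return None, 0.0

    # Representative class: the most common label among the neighbors.
    label, count = Counter(neighbors).most_common(1)[0]
    return label, count / len(neighbors)
```

For a chain holding two "bright" entries near the origin and one "dark" entry nearby, a query close to the origin with a neighborhood distance of 1.5 returns "bright" with an agreement of two thirds.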

28. Regarding claim 24, the further limitation of claim 22, see the preceding 
argument with respect to claim 23. Blum teaches classifying the entries according to 
statistical properties of the spectral properties of an entry, such as standard deviations 
or range values (column 21, lines 61-63). It is inherent to use the median to describe
skewed sample ranges (column 22, lines 21-26). 

29. Regarding claim 25, the further limitation of claim 22, see the preceding 
argument with respect to claim 23. Blum teaches a method of describing an 
unclassified entry according to a numerical value with these features. 

30. Regarding claim 26, the further limitation of claim 22, see the preceding
argument with respect to claim 31. Blum teaches the features of the parent claims 21
and 22, but Blum does not teach a level of confidence measure. Gjerdingen teaches a
measure indicating the level of confidence regarding classification.

31. Claims 30-32 are rejected under 35 U.S.C. 103(a) as being unpatentable over
Gjerdingen. 

32. Regarding claim 30, see Gjerdingen 

A computing system, comprising: 

a computing device including: 

a classification chain data structure stored thereon having a plurality of classification 
vectors, wherein each vector includes data representative of a spectral properties class as 
classified by humans and spectral properties characteristics as determined by digital signal 
processing; and (column 3, lines 23-61 and column 9, lines 28-39) 

processing means for comparing an unclassified media entity to the classification chain 
data structure to determine an estimate of the spectral properties class of the unclassified media 
entity (column 6, line 38 - column 7, line 2). 

Gjerdingen teaches a computing system with these features to create a searchable
database. Gjerdingen teaches human and machine classification (figure 4, items 401,
403 and 403B and column 6, lines 38-64), wherein the classification vector may collect
DSP data (column 6, lines 48-50, column 14, line 36 - column 15, line 7, and figure 4,
unit 404). It is obvious to collect both of these data and collect the data in the "acquired
data" for the purpose of creating better classifications of the analyzed data. A Press
Release, titled "We Know What You Like: Can sites like MongoMusic and LAUNCHcast
really tell your musical tastes?" published in The Industry Standard on 3/7/2000
(http://web.archive.org/web/20000824112802/www.mongomusic.com/s/press_standard_030700),
available from MongoMusic.com as archived by the Wayback Machine
(http://www.archive.org/index.php), provides evidence that it was obvious at the time of
the invention to create a "semi-automated, semi-human-based system." (see p. 2 of the
Press Release).

33. Regarding claim 31, the further limitation of claim 30, see Gjerdingen

... wherein said determining of an estimate of the spectral properties class includes returning at least one
number indicating the level of confidence of the spectral properties class assignment, (column 10,
lines 53-57)

Gjerdingen teaches a level of confidence indicator. 

34. Regarding claim 32, the further limitation of claim 31, see the preceding
argument with respect to claims 30 and 31. It is inherent that a system using the
method taught by Gjerdingen will undergo an improvement in classification with expert
review and more data samples (column 8, lines 19-24).



35. Claims 34 and 35 are rejected under 35 U.S.C. 103(a) as being unpatentable 
over Blum in view of Gjerdingen and Bahl (previously cited). 

36. Regarding claim 34, Blum teaches a method for classifying audio data according 
to its spectral properties (abstract), comprising: 

classifying by human experts each entry of a representative set of sounds according to their 
spectral perceptual qualities (column 3, lines 30-33, wherein users create classes); 

assigning each entry in the representative set at least one value based on digital signal 
processing (column 3, lines 4-21); 

reducing the results to a set of numbers called the characteristic vector of each sound (column 
6, lines 24-52); 

storing the characteristic vector in a classification chain (column 6, lines 35-36); 

receiving a digital audio information (column 6, lines 13-15); 

dividing the digital audio information into frames (column 6, lines 56-58); 

determining a sonic characterization vector as a function of the energy, entropy and rate of 
change of frequencies in at least one frame (column 6, lines 24-30, column 7, line 61 - column 
8, line 6 and column 8, lines 53-55); and 

presenting the characteristic vector to the classification chain, which returns an estimate of the 
spectral properties (column 6, lines 45-52). 

Blum teaches human classification of the audio data, wherein the user of the system
groups sounds into classes. Blum also teaches a sonic characterization vector as a
function of the energy (i.e. sum of squares of the magnitude spectrum) and the rate of
change of frequencies (i.e. the first derivative of a pitch data set, or "trajectory") in at least
one frame. Blum does not teach classification by a human expert, nor does Blum teach
characterization by an entropy calculation.

Gjerdingen teaches a similar classification method (abstract). Specifically, 
Gjerdingen uses an expert user's opinion for creating "classes" (column 3, lines 44-61). 
It would have been obvious for one of ordinary skill in the art at the time of the invention 
to combine the teachings of Blum and Gjerdingen for the purpose of utilizing an expert 
musician's opinion to create better definitions between classes. The combination, 
however, does not teach the use of an entropy calculation to characterize sonic qualities 
of frames. 

Bahl teaches a method of partitioning a feature space of a classification system 
(abstract). Specifically, Bahl teaches calculating an entropy measure (column 5, lines 
55-67 and column 6, equations 2 and 3), wherein Bahl attempts to minimize entropy 
between feature vectors within classes and pick representative feature vectors for each 
class to speed up searching and comparison of new sounds (column 2, lines 8-24). It 
would have been obvious for one of ordinary skill in the art at the time of the invention to 
combine the teachings of Blum, Gjerdingen, and Bahl for the purpose of speeding up 
pattern classification systems. 

37. Regarding claim 35, the further limitation of claim 34, see the preceding 
argument with respect to claim 34. The combination of Blum, Gjerdingen, and Bahl 
teaches these features in a method of classifying data. 



Application/Control Number: 09/935,349 Page 13 

Art Unit: 2615 

Conclusion 

38. The prior art made of record and not relied upon is considered pertinent to 
applicant's disclosure: 

Slaney, USPN 5,749,073 (previously introduced) - is used as evidence that
gathering MFCCs is synonymous with critical band filtering;

Pi et al., US PGPub 2004/0015357 (previously introduced) - is used as evidence
that gathering MFCCs is synonymous with critical band filtering;

Mauuary et al., USPN 6,157,909 (previously introduced) - is used as evidence
that gathering MFCCs is synonymous with critical band filtering;

Glaser et al., USPN 7,003,515 (previously introduced) - teaches classification 
using vectors (see Brief Summary, column 1-2); 

Logan et al., USPN 7,031,980 (previously introduced) - teaches different spectral
representations of the input signal and MFCCs (see Detailed Description, column 5-6); 

Forbes.com "MongoMusic Fans Include Microsoft" (previously introduced) - 
teaches expert classification and DSP classification; and 

Snyder, Julene, The Industry Standard's Beat Sheet A Weekly Report on the 
Convergence of Music and the Net, "We Know What You Like: Can sites like 
MongoMusic and LAUNCHcast really tell your musical tastes?", 3/7/00, 
http://web.archive.org/web/20000824112802/www.mongomusic.com/s/press_standard_030700
- teaches "semi-automated, semi-human-based system" for classification of
audio. 



Any inquiry concerning this communication or earlier communications from the 
examiner should be directed to DANIEL R. SELLERS whose telephone number is 
(571)272-7528. The examiner can normally be reached on Monday to Friday, 9am to 
5:30pm. 

If attempts to reach the examiner by telephone are unsuccessful, the examiner's
supervisor, Suhan Ni, can be reached at (571)272-7505. The fax phone number for the
organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of an application may be obtained from the 
Patent Application Information Retrieval (PAIR) system. Status information for 
published applications may be obtained from either Private PAIR or Public PAIR. 
Status information for unpublished applications is available through Private PAIR only. 
For more information about the PAIR system, see http://pair-direct.uspto.gov. Should 
you have questions on access to the Private PAIR system, contact the Electronic 
Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a 
USPTO Customer Service Representative or access to the automated information 
system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000. 

/Daniel R. Sellers/ 
Examiner, Art Unit 2615 



/Suhan Ni/ 

Primary Examiner, Art Unit 2614 



